AIbase

# Lightweight multimodal

SmolVLM Instruct GGUF
Apache-2.0
SmolVLM is a compact open-source multimodal model that accepts image and text inputs and generates text outputs. It is designed for efficiency and is suitable for on-device applications.
Image-to-Text, Transformers, English
Mungert
1,023
2
SmolVLM2 2.2B Instruct
Apache-2.0
SmolVLM2-2.2B is a lightweight multimodal model designed for analyzing video content. It can process video, image, and text inputs and generate text outputs.
Image-to-Text, Transformers, English
HuggingFaceTB
62.56k
164
UForm Gen2 Qwen 500M
Apache-2.0
UForm-Gen is a small generative vision-language model used primarily for image captioning and visual question answering.
Image-to-Text, Transformers, English
unum-cloud
17.98k
76
© 2025 AIbase